Inference:
Examples

Edward Vytlacil

Event Study: Elon Musk Tweet

  • Event Study: Elon Musk Tweet

  • Audit Study: Neumark, Bank, and Van Nort (1996)

Event Study: Elon Musk Tweet

  • Elon Musk tweets an image of a dog in a sweater on the cover of a magazine at 5:47 PM EST (10:47 PM UTC) on January 28, 2021.

  • Tweet interpreted as supporting the cryptocurrency Dogecoin

Event Study: Elon Musk Tweet

  • Consider the price of Dogecoin in the two-hour windows before and after the tweet.

  • The price of Dogecoin almost doubles.

Event Study: Elon Musk Tweet

  • Define returns: \[ R_{t}= \frac{P_{t}-P_{t-1}}{P_{t-1}}.\]
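
As a minimal sketch of this definition, assuming a hypothetical price vector `P` (the values are made up for illustration and are not Dogecoin data), simple returns can be computed in R as:

```r
# Hypothetical prices at t = 0, 1, 2, 3 (illustrative values only)
P <- c(100, 102, 101, 105)

# Simple returns: R_t = (P_t - P_{t-1}) / P_{t-1}, defined for t = 1, 2, 3
R <- diff(P) / head(P, -1)
R
```

Note that the return series has one fewer observation than the price series, since the first price has no predecessor.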

Event Study: Elon Musk Tweet

  • Model for returns with no event:
    “Constant Mean Return Model”, \[ R_{t} = \mu + \epsilon_{t}\] with \(\epsilon_t \sim N(0, \sigma^2)\), i.i.d. over \(t\).

    • Could also consider more sophisticated models for returns, for example, Market-Adjusted model based on CAPM.

Event Study: Elon Musk Tweet

  • Define

    • “Abnormal Returns”: \[ AR_{t}= R_t - \mu = \epsilon_t\]
    • “Cumulative Abnormal Returns” over period \([0,120]\): \[ CAR_{0,120}= \sum_{t=0}^{120} AR_t = \sum_{t=0}^{120} \epsilon_t\]
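
A small numerical sketch of these two definitions, assuming a hypothetical vector of event-window returns `R.event` and a hypothetical known mean return `mu` (in practice \(\mu\) is unknown and must be estimated):

```r
# Hypothetical returns in an event window and an assumed mean return mu
# (illustrative values only)
R.event <- c(0.010, -0.004, 0.020, 0.006)
mu      <- 0.001

# Abnormal returns: AR_t = R_t - mu
AR <- R.event - mu

# Cumulative abnormal return: sum of AR_t over the event window
CAR <- sum(AR)
CAR
```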

Event Study: Elon Musk Tweet

  • Take estimation window to be two hours before tweet, estimate \(\mu\) and \(\sigma^2\) using returns in estimation window.

  • Take event window to be two hours post-tweet.

  • Other choices for windows? Threats to validity?
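
The window choice can be sketched on simulated data. Everything below is an assumption for illustration: the returns are simulated from a normal distribution rather than taken from Dogecoin prices, and `mu.hat`, `sigma2.hat`, and `AR.event` are hypothetical names.

```r
set.seed(1)  # for reproducibility of the simulated data

# Simulated minute-level returns for t = -120, ..., 120 (not real data)
t <- -120:120
R <- rnorm(length(t), mean = 0, sd = 0.002)

# Estimation window: -120 <= t < 0
est        <- R[t < 0]
mu.hat     <- mean(est)  # estimate of mu
sigma2.hat <- var(est)   # estimate of sigma^2 (sample variance, divisor n - 1)

# Abnormal returns in the event window, 0 <= t <= 120, using the estimated mean
AR.event <- R[t >= 0] - mu.hat
c(mu.hat = mu.hat, sigma2.hat = sigma2.hat, L = length(AR.event))
```

Changing the logical conditions `t < 0` and `t >= 0` is all it takes to try other window choices in this sketch.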

Event Study: Elon Musk Tweet

Model for AR during estimation window, \(-120 \le t < 0\): \[AR_t \sim N(0, \sigma^2), ~~\mbox{i.i.d. over } t.\]
Model for AR during event window, \(0\le t \le 120\): \[AR_t \sim N(\theta,\nu^2), ~~\mbox{i.i.d. over } t.\]
Consider: \[\mathbb{H}_0: \theta=0, \nu^2=\sigma^2, ~~ \mbox{vs} ~~\mathbb{H}_1: \theta \ne 0 ~\mbox{or}~ \nu^2 \ne \sigma^2\]
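
One way to connect this model to a test statistic, sketched under the simplifying assumption that \(\mu\) and \(\sigma^2\) are known (in practice they are replaced by estimates from the estimation window, and this sketch ignores that estimation error): write \(L\) for the number of observations in the event window. Under \(\mathbb{H}_0\), the sum of \(L\) i.i.d. \(N(0,\sigma^2)\) abnormal returns is normal with variance \(L\sigma^2\), so

```latex
\[ CAR_{0,120} = \sum_{t=0}^{120} AR_t \sim N(0, L\sigma^2)
   \quad\Longrightarrow\quad
   \frac{CAR_{0,120}}{\sqrt{L\sigma^2}} \sim N(0,1). \]
```

Large absolute values of the standardized CAR are then evidence against the null.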

Steps for Hypothesis Testing

  • State \(\mathbb{H}_0\) and \(\mathbb{H}_1\) hypotheses;
  • Decide significance level \(\alpha\);
  • Formulate test statistic, determine distribution of test statistic under the null;
  • Either:
    • Choose 1−α quantile of distribution of test statistic under the null as critical value, reject null at level α if test statistic bigger than this critical value, or
    • Construct p-value as probability under the null of observing a test statistic at least as large as the one observed in the sample, reject null at level α if p-value less than α.
  • How to proceed in this example?
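
The two decision rules in the last step can be sketched for a test statistic that is \(N(0,1)\) under the null, using its absolute value for a two-sided test. The observed value `z` below is hypothetical.

```r
alpha <- 0.05
z     <- 2.3   # hypothetical observed value of a N(0,1) test statistic

# Use |Z| as the test statistic; its 1 - alpha quantile under the null
# equals the 1 - alpha/2 quantile of N(0,1)
crit        <- qnorm(1 - alpha / 2)
reject.crit <- abs(z) > crit

# p-value: probability under the null of |Z| at least as large as observed
p.value  <- 2 * (1 - pnorm(abs(z)))
reject.p <- p.value < alpha

c(crit = crit, p.value = p.value)
reject.crit == reject.p  # the two decision rules agree
```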

Event Study: Elon Musk Tweet

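# Assumes df.R (returns data with columns date and AR), tweet.date, and doge.var
# (variance of returns estimated on the estimation window) are already defined
# No. obs in event window
  L  <- nrow(subset(df.R,date>tweet.date))
# Cumulative AR in event window
  CAR <- sum(subset(df.R,date>tweet.date)$AR)
# Inference for CAR at the 0.05 level: under the null, CAR/sqrt(L*sigma^2) ~ N(0,1)
  test.stat  <-  CAR/sqrt(L*doge.var)
# Two-sided p-value
  p.value  <- 2* ( 1 - pnorm(abs(test.stat)))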
  test.stat
[1] 6.220559
  qnorm(.975)
[1] 1.959964
  p.value
[1] 0.0000000004953882
# equivalently, using the mean AR in the event window
  test.stat2  <-  (CAR/L)/sqrt(doge.var/L)
  p.value2  <- 2* ( 1 - pnorm(abs(test.stat2)))
  test.stat2
[1] 6.220559
  p.value2 
[1] 0.0000000004953882

Event Study: Elon Musk Tweet

  • Threats to validity?

  • Could we follow same procedure with daily data?

  • What if we only had one observation in event window?

  • What if we didn’t assume returns are normally distributed?

Audit Study: Neumark, Bank, and Van Nort (1996)

Audit Study

“Two individuals (auditors or testers) are matched for all relevant personal characteristics other than the one that is presumed to lead to discrimination, e.g., race, ethnicity, gender. They then apply for a job, a housing unit, or a mortgage, or begin to negotiate for a good or service. The results they achieve and the treatment they receive in the transaction are carefully observed, documented, and analyzed to determine if the outcomes reveal patterns of differential treatment on the basis of the trait studied and/or protected by anti-discrimination laws. . . .” Fix and Struyk (1993)

Audit Study Methodology:

  • Use pairs of actors, one minority and one non-minority, with similar relevant attributes.

  • Send a pair of actors to apply for each job (or loan, etc.).

  • Check whether outcomes differ between minority and non-minority actors.

Audit Study

Neumark, Bank, and Van Nort (1996):
Sex Discrimination in Restaurant Hiring: An Audit Study

  • Investigates sex discrimination in restaurant hiring of waiters.

  • Selected 65 restaurants in Philadelphia, classified as high-, medium-, or low-price.

  • Four auditors, two men and two women, all college students.

Audit Study

Neumark, Bank, and Van Nort (1996):
Sex Discrimination in Restaurant Hiring: An Audit Study

  • Created three fake resumes for auditors:
    • resumes constructed to be similar in work experience,
    • fake resumes rotated across the auditors.
  • Sent one male auditor and one female auditor to apply for waiter positions at each restaurant.

Audit Study

Neumark, Bank, and Van Nort (1996):
Sex Discrimination in Restaurant Hiring: An Audit Study

  • 130 job applications resulted in 54 job interviews, and 39 of those resulted in job offers.

Audit Study:

Audit of Restaurants (full sample)
             High-Price (n = 46)   Med-Price (n = 42)   Low-Price (n = 42)
Offer?       No.    Frac           No.    Frac          No.    Frac
Neither      22     0.48           18     0.43          26     0.62
Both          2     0.04            8     0.19           4     0.10
Only Man     20     0.43           12     0.29           0     0.00
Only Woman    2     0.04            4     0.10          12     0.29
Source: Neumark, Bank, and Van Nort (1996).
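
As a sketch of how the table above can be handled in R, the counts can be stored in a matrix (the object names `counts`, `offer.man`, and `offer.woman` are hypothetical) to recover the "Frac" columns and the overall offer rates by sex:

```r
# Counts from the full-sample table (rows: outcome, columns: price category)
counts <- matrix(c(22, 18, 26,
                    2,  8,  4,
                   20, 12,  0,
                    2,  4, 12),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(c("Neither", "Both", "Only Man", "Only Woman"),
                                 c("High", "Med", "Low")))

# Column fractions, matching the "Frac" columns of the table
round(prop.table(counts, margin = 2), 2)

# Share of observations in which the man (woman) received an offer
offer.man   <- colSums(counts[c("Both", "Only Man"), ])   / colSums(counts)
offer.woman <- colSums(counts[c("Both", "Only Woman"), ]) / colSums(counts)
round(rbind(offer.man, offer.woman), 2)
```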

How to conduct inference?

Steps for Hypothesis Testing

  • State \(\mathbb{H}_0\) and \(\mathbb{H}_1\) hypotheses;
  • Decide significance level \(\alpha\);
  • Formulate test statistic, determine distribution of test statistic under the null;
  • Either:
    • Choose 1−α quantile of distribution of test statistic under the null as critical value, reject null at level α if test statistic bigger than this critical value, or
    • Construct p-value as probability under the null of observing a test statistic at least as large as the one observed in the sample, reject null at level α if p-value less than α.
  • How to proceed in this example?

Audit Study: Paired Sign Test

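Consider Paired Sign Test:

  • State \(\mathbb{H}_0\) and \(\mathbb{H}_1\) hypotheses;

    • State null hypothesis as symmetry in the 2×2 table of (man offer, woman offer): \(\Pr(\text{only man}) = \Pr(\text{only woman})\).
  • Condition on the discordant outcomes, only man or only woman.

  • Use binomial test on conditional sample: under the null, each discordant outcome is “only man” with probability 0.5.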
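
A minimal sketch of what the binomial test computes, using the high-price counts from the full-sample table (20 only-man and 2 only-woman outcomes). The variable names are hypothetical, and doubling the smaller tail is valid here because the null distribution Binomial(n, 0.5) is symmetric:

```r
# Discordant counts for high-price restaurants (from the full-sample table)
n.man   <- 20  # only the man received an offer
n.woman <- 2   # only the woman received an offer
n       <- n.man + n.woman

# Under the null of symmetry, n.man ~ Binomial(n, 0.5) given n.
# Two-sided p-value: double the smaller tail, capped at 1
k       <- min(n.man, n.woman)
p.value <- min(1, 2 * pbinom(k, size = n, prob = 0.5))
p.value

# Agrees with the exact binomial test in base R
binom.test(n.man, n)$p.value
```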

Audit Study

# High price restaurants
binom.test(20,22)

    Exact binomial test

data:  20 and 22
number of successes = 20, number of trials = 22, p-value = 0.0001211
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.7083873 0.9887944
sample estimates:
probability of success 
             0.9090909 
# Med price restaurants
binom.test(12,16)

    Exact binomial test

data:  12 and 16
number of successes = 12, number of trials = 16, p-value = 0.07681
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.4762292 0.9273380
sample estimates:
probability of success 
                  0.75 
# Low price restaurants
binom.test(0,12)

    Exact binomial test

data:  0 and 12
number of successes = 0, number of trials = 12, p-value = 0.0004883
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.0000000 0.2646485
sample estimates:
probability of success 
                     0 

Audit Study

Audit of Restaurants (conditional sample)
                   High-Price (n = 22)   Med-Price (n = 16)   Low-Price (n = 12)
Offer?             No.    Frac           No.    Frac          No.    Frac
Only Man           20     0.91           12     0.75           0     0.00
Only Woman          2     0.09            4     0.25          12     1.00
p-value symmetry   0.00012               0.077                0.00049
Source: Neumark, Bank, and Van Nort (1996).

Audit Study:

  • Very different results for the two women auditors.

  • The authors suspect the difference is due to one woman auditor being Asian-American.

  • Also consider an analysis dropping the audits involving the Asian-American auditor.

Audit Study:

Audit of Restaurants (Dropping Woman 2)
             High-Price (n = 22)   Med-Price (n = 22)   Low-Price (n = 12)
Offer?       No.    Frac           No.    Frac          No.    Frac
Neither      10     0.45            8     0.36           9     0.75
Both          2     0.09            6     0.27           1     0.08
Only Man     10     0.45            4     0.18           0     0.00
Only Woman    0     0.00            4     0.18           2     0.17
Source: Neumark, Bank, and Van Nort (1996).

How to conduct inference?

Audit Study

# High price restaurants
binom.test(10,10)

    Exact binomial test

data:  10 and 10
number of successes = 10, number of trials = 10, p-value = 0.001953
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.6915029 1.0000000
sample estimates:
probability of success 
                     1 
# Med price restaurants
binom.test(4,8)

    Exact binomial test

data:  4 and 8
number of successes = 4, number of trials = 8, p-value = 1
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.1570128 0.8429872
sample estimates:
probability of success 
                   0.5 
# Low price restaurants
binom.test(0,2)

    Exact binomial test

data:  0 and 2
number of successes = 0, number of trials = 2, p-value = 0.5
alternative hypothesis: true probability of success is not equal to 0.5
95 percent confidence interval:
 0.0000000 0.8418861
sample estimates:
probability of success 
                     0 

Audit Study

Audit of Restaurants (conditional sample, dropping Woman 2)
                   High-Price (n = 10)   Med-Price (n = 8)    Low-Price (n = 2)
Offer?             No.    Frac           No.    Frac          No.    Frac
Only Man           10     1.00            4     0.50           0     0.00
Only Woman          0     0.00            4     0.50           2     1.00
p-value symmetry   0.002                 1                    0.5
Source: Neumark, Bank, and Van Nort (1996).

Audit Study

  • Is not rejecting the null evidence that the null is true?

  • Including both women auditors, we reject the null for low-price restaurants (p-value: 0.00049).

  • Including only the non-Asian-American woman auditor, we cannot reject the null (p-value: 0.5).

  • Important to consider power of test (see Handout 3).
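
The role of power can be sketched for the exact two-sided binomial test used above. The helper `sign.test.power` and the alternative `p.true = 0.1` are hypothetical choices for illustration, not taken from the paper or the handout:

```r
# Power of the exact two-sided binomial (sign) test at level alpha, given
# n discordant pairs and a true probability p.true of an "only man" outcome
sign.test.power <- function(n, p.true, alpha = 0.05) {
  x <- 0:n
  # Exact two-sided p-value under the null p = 0.5 for each possible outcome
  p.vals <- sapply(x, function(k) binom.test(k, n, p = 0.5)$p.value)
  reject <- p.vals < alpha
  # Power: probability of landing in the rejection region when p = p.true
  sum(dbinom(x[reject], size = n, prob = p.true))
}

power.n2  <- sign.test.power(n = 2,  p.true = 0.1)  # 0: no outcome has p-value below 0.05
power.n12 <- sign.test.power(n = 12, p.true = 0.1)
c(n2 = power.n2, n12 = power.n12)
```

With only two discordant pairs, the smallest attainable p-value is 0.5, so the test cannot reject at the 0.05 level whatever the truth is. Failing to reject in that sample therefore says little about whether the null holds.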

Disadvantages of Audit Studies:

  • Can be hard to make each pair of actors similar in relevant attributes, and attempting to do so can exacerbate the effect of those attributes that are not similar (Heckman 1998).

  • Not double-blind. Behavior or beliefs of actors may lead to differences in outcomes.

  • Expensive, resulting in small sample sizes.

  • Not appropriate for all situations, for example, use of police force or criminal sentencing decisions.

The more recent “correspondence methodology” alleviates the first three issues while exacerbating the fourth.

Fix, M, and RJ Struyk. 1993. “Clear and Convincing Evidence: Measurement of Discrimination in America (Natural Field Experiments No. 0049).” The Field Experiments Website.
Heckman, James J. 1998. “Detecting Discrimination.” Journal of Economic Perspectives 12 (2): 101–16.
Neumark, David, Roy J Bank, and Kyle D Van Nort. 1996. “Sex Discrimination in Restaurant Hiring: An Audit Study.” The Quarterly Journal of Economics 111 (3): 915–41. https://doi.org/10.2307/2946676.